5/26/2021

A good reference

Your turn

What is Git, and why use it?

Git - History Tracker

The life of your software is recorded from the beginning.

  • at any moment you can revert to a previous revision

  • the history is browseable, you can inspect any revision

    • when was it done?
    • who wrote it?
    • what was changed?
    • why?
    • in which context?
  • all the deleted content remains accessible in the history

Git - Team Collaboration

Git helps you to:

  • share a collection of files with your team
  • merge changes done by other users
  • ensure that nothing is accidentally overwritten
  • know who you must blame when something is broken

Git - Multiple Version Maintenance

You may have multiple variants of the same software, materialized as branches, for example:

  • a main branch
  • a maintenance branch (to provide bugfixes in older releases)
  • a development branch (to make disruptive changes)
  • a release branch (to freeze code before a new release)

Git will help you to:

  • handle multiple branches concurrently
  • merge changes from a branch into another one

Git - External Contribution Manager

Git helps working with third-party contributors:

  • it gives them visibility of what is happening in the project
  • it helps them to submit changes (patches) and it helps you to integrate these patches
  • it supports forking the development of the software and merging it back into the mainline

Install Git

  • Windows: Git for Windows
  • macOS: xcode-select --install
  • Linux: sudo apt-get install git (Ubuntu or Debian) or sudo yum install git (Fedora or RedHat)

Note for Windows users

  • When asked about “Adjusting your PATH environment”, make sure to select “Git from the command line and also from 3rd-party software”. Otherwise, we believe it is good to accept the defaults.
  • Note that RStudio for Windows prefers Git to be installed below C:/Program Files, which appears to be the default. Unless you have specific reasons to do otherwise, follow this convention.

Introduce yourself

In the shell:

git config --global user.name 'Jane Doe'
git config --global user.email 'jane@example.com'

substituting your name and the email associated with your GitHub account.
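To check what Git recorded, you can read the values back. A minimal sketch: the throwaway HOME is an assumption added here so that the demo does not touch your real ~/.gitconfig; drop that line to configure your actual account.

```shell
# Assumption: a scratch HOME, so the demo leaves your real ~/.gitconfig alone.
export HOME="$(mktemp -d)"

git config --global user.name 'Jane Doe'
git config --global user.email 'jane@example.com'

# Read the two values back.
git config --global user.name
git config --global user.email
```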

Your turn

The Git Jargon

Repository

Git is a version control system whose original purpose was to help developers work collaboratively on big software projects. Git manages the evolution of a set of files called a repository (repo).

For new or existing projects, we recommend that you:

  • Dedicate a local directory to it.
  • Make it a Git repository.

How often and when should I do that?

  • happens once per project.
  • can happen at project inception or at any later point.

Should I then be afraid of this new folder linked to Git?

  • Chances are your existing projects each already live in a dedicated directory.
  • Making such a directory a Git repository boils down to allowing Git to leave notes for itself in hidden files or directories.
  • The project is still a regular directory on your computer, that you can locate, name, move, and generally interact with as you wish. You don’t have to handle it with special gloves!

Commits

The daily workflow is probably not dramatically different from what you do currently. You work in the usual way, writing R scripts or authoring reports in LaTeX or R Markdown. But instead of only saving individual files, periodically you make a commit, which takes a snapshot of all the files in the entire project.

Chances are you are already into Git practices without knowing it

Task: Versioning

  • Purpose: You wrote a file that has reached a version that is significant to you and that you might want to inspect or revert to later. Or you are reviewing a version of someone else’s document.
  • Most researchers’ approach: Append your initials and the date to the end of the current file name.
  • Git approach: Make a commit.

Task: Backing up, sharing

  • Purpose: You optimized and exhaustively tested your code and want to release it. This version is so important that you want to archive it and make sure it can never be lost by accident.
  • Most researchers’ approach: In addition to the copy saved on your computer, save it on an external hard drive or in a cloud service such as CNRS Seafile or UNCLOUD.
  • Git approach: Periodically push commits to GitHub.

Structure of a commit

Commit history of a single file

Tags. You can also designate certain snapshots as special with a tag, which is a name of your choosing. In a software project, it is typical to tag a release with its version, e.g., “v1.0.3”. For a manuscript or analytical project, you might tag the version submitted to a journal or transmitted to external collaborators.

Working locally

Create a new repository

git init myrepository # Creates the directory myrepository and makes it a Git repo
git init # Makes the current directory a Git repo

The first command creates the directory myrepository:

  • The repository is located in the hidden folder myrepository/.git/
  • The (initially empty) working copy is located in the folder myrepository/
git init macs
Initialized empty Git repository in /Users/stamm-a/Softs/git-workshop/macs/.git/
ls -a macs
.
..
.git
ls macs/.git/
HEAD
config
description
hooks
info
objects
refs

Inspect current state of your Git repo

cd macs
git status # show the status of the index and working copy
On branch master

No commits yet

nothing to commit (create/copy files and use "git add" to track)
echo 'Hello World!' > hello
git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    hello

nothing added to commit but untracked files present (use "git add" to track)

Commit your first files

git add files # Copy files into the index
git commit [-m message] # Commits the content of the index
git add hello
git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

    new file:   hello
git commit -m "added file 'hello'"
[master (root-commit) 60b463c] added file 'hello'
 1 file changed, 1 insertion(+)
 create mode 100644 hello
git status
On branch master
nothing to commit, working tree clean

Staging area

Update a file

echo "Happy" >> hello
git commit
On branch master
Changes not staged for commit:
    modified:   hello

no changes added to commit
git add hello
git commit -m "added Happy to hello content"
[master c14105a] added Happy to hello content
 1 file changed, 1 insertion(+)

Deleting a file

git rm file # remove the file from the index and from the working copy
git commit # commit the index
git rm hello
rm 'hello'
git commit -m "removed hello"
[master 13db40e] removed hello
 1 file changed, 2 deletions(-)
 delete mode 100644 hello
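Deleting a file does not erase it from the history. As an illustration, the sketch below replays a shortened version of the example in a scratch directory (the directory and the "restored hello" message are assumptions made for the demo), then reads the deleted file from the previous commit and brings it back.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo 'Hello World!' > hello
git add hello
git commit -q -m "added file 'hello'"
git rm -q hello
git commit -q -m "removed hello"

# Print the file as it was one commit before HEAD; the working copy is untouched.
git show HEAD~1:hello

# Restore it into the index and the working copy, then record the restoration.
git checkout HEAD~1 -- hello
git commit -q -m "restored hello"
cat hello
```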

Showing differences

git diff [rev_a [rev_b]] [-- path ...]

Shows the differences between two revisions rev_a and rev_b. By default:

  • rev_a is the index,
  • rev_b is the working copy.

git diff --staged [rev_a] [-- path ...]

Shows the differences between rev_a and the index. By default:

  • rev_a is HEAD (a symbolic reference to the last commit).

About git diff and the index

Diff examples

echo foo >> hello
git add hello
echo bar >> hello
git diff
diff --git a/hello b/hello
index 257cc56..3bd1f0e 100644
--- a/hello
+++ b/hello
@@ -1 +1,2 @@
 foo
+bar
git diff --staged
diff --git a/hello b/hello
new file mode 100644
index 0000000..257cc56
--- /dev/null
+++ b/hello
@@ -0,0 +1 @@
+foo
git diff HEAD
diff --git a/hello b/hello
new file mode 100644
index 0000000..3bd1f0e
--- /dev/null
+++ b/hello
@@ -0,0 +1,2 @@
+foo
+bar

Reset changes

git reset [-- path ...]
git reset --hard
  • git reset drops the changes staged into the index (the index is restored to the state of the last commit; the working copy is left untouched),
  • git reset --hard drops all the changes in the index and in the working copy (it does not accept paths).
git checkout -- path

This command restores a file (or directory) as it appears in the index (thus it drops all unstaged changes).

git diff HEAD
diff --git a/hello b/hello
new file mode 100644
index 0000000..3bd1f0e
--- /dev/null
+++ b/hello
@@ -0,0 +1,2 @@
+foo
+bar
git checkout -- .
git diff HEAD
diff --git a/hello b/hello
new file mode 100644
index 0000000..257cc56
--- /dev/null
+++ b/hello
@@ -0,0 +1 @@
+foo
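The example above covers git checkout; the two forms of git reset can be compared in the same spirit. This is a sketch in a scratch repository (the directory and the commit message are assumptions made for the demo): a plain git reset only unstages, while git reset --hard also discards the edit in the working copy.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo foo > hello
git add hello
git commit -q -m "added hello"

echo bar >> hello
git add hello                         # 'bar' is now staged

git reset -q                          # unstage: the index matches HEAD again
after_unstage=$(git status --short)   # the edit is still in the working copy
echo "$after_unstage"                 # prints " M hello"

git reset -q --hard                   # discard the edit in the working copy too
cat hello                             # prints foo
```

Since git reset --hard throws away uncommitted work, it is worth running git status or git diff first.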

History

git log
commit 13db40ee532c975030cbd04afe3180aaa5b04b9e
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date:   Mon May 31 11:58:58 2021 +0200

    removed hello

commit c14105a3ead265bdb97ca355a12aebd6d874fea5
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date:   Mon May 31 11:58:58 2021 +0200

    added Happy to hello content

commit 60b463c6f04ea3bf069f279ad174c0571a0b0fb1
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date:   Mon May 31 11:58:58 2021 +0200

    added file 'hello'
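A few git log options map onto the questions raised earlier (who, when, what, why). The sketch below rebuilds a two-commit history in a scratch directory, which is an assumption made for the demo; the options themselves are standard git log options.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo 'Hello World!' > hello
git add hello
git commit -q -m "added file 'hello'"
echo "Happy" >> hello
git add hello
git commit -q -m "added Happy to hello content"

# Compact view: one line per commit (short hash and message).
git log --oneline

# Who (%an), when (%ad) and why (%s) for each commit.
git log --format='%h | %an | %ad | %s' --date=short

# What was changed: the history of a single file, with patches.
git log -p -- hello
```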

Other local commands

Commit details:

git show
commit 13db40ee532c975030cbd04afe3180aaa5b04b9e
Author: Aymeric Stamm <aymeric.stamm@math.cnrs.fr>
Date:   Mon May 31 11:58:58 2021 +0200

    removed hello

diff --git a/hello b/hello
deleted file mode 100644
index ebc17f3..0000000
--- a/hello
+++ /dev/null
@@ -1,2 +0,0 @@
-Hello World!
-Happy
git mv # move or rename a file
git tag # create or delete tags
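As a brief illustration of the last two commands, the sketch below renames a file and tags the resulting commit. The scratch repository, the new file name greeting and the tag message are hypothetical; the tag name v1.0.3 is borrowed from the earlier slide on tags.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name 'Jane Doe'
git config user.email 'jane@example.com'

echo 'Hello World!' > hello
git add hello
git commit -q -m "added file 'hello'"

# Rename the file; the rename is staged automatically.
git mv hello greeting
git status --short                    # prints "R  hello -> greeting"
git commit -q -m "renamed hello to greeting"

# Follow the file's history across the rename.
git log --follow --oneline -- greeting

# Create an annotated tag on the current commit, then list the tags.
git tag -a v1.0.3 -m "Release 1.0.3"
git tag --list
```

Use git tag -d v1.0.3 to delete the tag again.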

Git Clients

A picture is worth a thousand words

Which do you prefer?

Recommended Git clients

Your turn

Going live

Ways to go live

GitHub

Go to https://github.com and make sure you are logged in.

Click the green “New repository” button. Or, if you are on your own profile page, click on “Repositories”, then click the green “New” button.

  • Repository name: myrepo (or whatever you wish)
  • Public
  • YES Initialize this repository with a README

Click the big green button “Create repository.”

Copy the HTTPS clone URL to your clipboard via the green “Clone or Download” button. Or copy the SSH URL if you chose to set up SSH keys.

GitLab and Bitbucket

Refer to https://happygitwithr.com/new-github-first.html#make-a-repo-on-github-2.

Connect to GitHub

Your first push will likely prompt you for your GitHub username and password. Being prompted on every push will drive you crazy in the long run and make you reluctant to push. You want to eliminate this annoyance.

Set up a personal access token (PAT) once and for all.

Use it as your password the first time a Git command prompts you for your credentials.

GitHub will no longer bother you with credentials after that.

Main recommendations

  • Adopt HTTPS as your Git transport protocol
  • Turn on two-factor authentication for your GitHub account
  • Use a personal access token (PAT) for all Git remote operations: https://github.com/settings/tokens
  • Allow tools to store and retrieve your credentials from the Git credential store.
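The last recommendation relies on a Git credential helper. Which helper is available depends on your platform, so the following is only a sketch: it assumes the in-memory cache helper (not available everywhere, for instance it may be missing on Windows) with an arbitrary one-hour timeout, and it uses a scratch HOME so that your real configuration is left alone. On macOS or Windows you would more likely rely on the helper that your Git installation or Git client configures for you.

```shell
# Assumption: a scratch HOME, so the demo leaves your real ~/.gitconfig alone.
export HOME="$(mktemp -d)"

# Assumption: the 'cache' helper, which keeps credentials in memory for a while.
git config --global credential.helper 'cache --timeout=3600'

# Check which helper Git will use.
git config --global credential.helper
```

With a helper in place, the PAT you enter at the first prompt is reused for later remote operations until it expires from the helper.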

DEMO

Your turn

  • Download the slides: https://github.com/astamm/git-workshop/blob/master/git-workshop-build.html
  • Open the HTML slides in Google Chrome
  • Open a terminal (macOS or Linux) or Git Bash app on Windows and introduce yourself to Git
  • Install GitKraken and open it as well (take time to answer the initial questions)
  • Go to GitHub and fork the astamm/git-workshop repo in your account
  • Clone the repo from your account on your computer (via GitKraken)

Different contexts

We create a new project, with the preferred “GitHub first, then RStudio” sequence. Why do we prefer this? Because this method of copying the Project from GitHub to your computer also sets up the local Git repository for immediate pulling and pushing. In the absence of other constraints, I suggest that all of your R projects have exactly this set-up.

This is the main approach if you already have an existing local project that you want to bring to GitHub.

There is also an explicit workflow for connecting an existing local R project to GitHub when, for some reason, you cannot or do not want to use a “GitHub first” workflow. When does this come up? For example, when the project is already a Git repo with a history you care about. Then you have to do this properly.

  • Contribute to an existing repo.

Own, clone, fork

Good Habits - Day 1 - Work in branches

  1. Connect a local directory to the remote team repo https://github.com/astamm/macs.git

    • Team member: git clone https://github.com/astamm/macs.git some_local_folder
    • External collaborator: forking process
  2. From the master branch, retrieve the latest changes made to the remote master branch via git pull;

  3. From the master branch, create another branch, my_awesome_feature, for implementing your brand new feature via git checkout -b my_awesome_feature;

  4. Write your code locally;

  5. Stage (git add to transfer to staging area) edited files;

  6. Commit (git commit [-m msg] to register staged work);

  7. Push (git push to send to remote branch);

  8. Rinse and repeat from Step 4.
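The Day 1 steps can be sketched end to end without network access. This is only an illustration under stated assumptions: the local bare repository team.git stands in for the team repo on GitHub, and the seed repository, the file names and the commit messages are hypothetical. The -u option on the first push is an addition here; it links the new local branch to its remote counterpart so that later pushes can be a plain git push.

```shell
set -e
# Identity for the demo commits (placeholders taken from the earlier slide).
export GIT_AUTHOR_NAME='Jane Doe' GIT_AUTHOR_EMAIL='jane@example.com'
export GIT_COMMITTER_NAME='Jane Doe' GIT_COMMITTER_EMAIL='jane@example.com'

work=$(mktemp -d)
cd "$work"

# Assumption: a local bare repo plays the role of the remote team repo.
git init -q --bare team.git
git -C team.git symbolic-ref HEAD refs/heads/master

# Give the "remote" one commit on master so that there is something to clone.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo '# macs' > README.md
git add README.md
git commit -q -m "initial commit"
git push -q ../team.git master
cd "$work"

# Step 1: connect a local directory to the team repo.
git clone -q team.git some_local_folder
cd some_local_folder

git pull -q                                  # Step 2: update master
git checkout -q -b my_awesome_feature        # Step 3: create the feature branch
echo 'hello <- function() "hello"' > feature.R   # Step 4: write code
git add feature.R                            # Step 5: stage
git commit -q -m "add feature skeleton"      # Step 6: commit
git push -q -u origin my_awesome_feature     # Step 7: push (first push sets upstream)
```

After this, the remote holds a my_awesome_feature branch pointing at the same commit as your local branch, and master is untouched.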

Good Habits - Day 2 - Stay tuned

This is the recommended workflow when you are already developing a new feature in a branch.

  1. Put your local folder on master branch via git checkout master;
  2. From the master branch, retrieve the latest changes made to the remote master branch via git pull;
  3. Switch back to the my_awesome_feature branch (git checkout my_awesome_feature) and bring in the latest changes from master via git merge master;
  4. Write your code locally;
  5. Stage (git add to transfer to staging area) edited files;
  6. Commit (git commit [-m msg] to register staged work);
  7. Push (git push to send to remote branch);
  8. Rinse and repeat from Step 4.
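As a rough sketch of the first three Day 2 steps, the script below simulates a teammate who updates master while you are working on your branch. Everything about the setup is an assumption made for the demo: team.git is a local bare repo standing in for GitHub, and the teammate repository, file names and commit messages are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME='Jane Doe' GIT_AUTHOR_EMAIL='jane@example.com'
export GIT_COMMITTER_NAME='Jane Doe' GIT_COMMITTER_EMAIL='jane@example.com'

work=$(mktemp -d)
cd "$work"

# Assumption: local bare repo as the remote, seeded by a hypothetical teammate.
git init -q --bare team.git
git -C team.git symbolic-ref HEAD refs/heads/master
git init -q teammate
cd teammate
git symbolic-ref HEAD refs/heads/master
echo 'v1' > core.txt
git add core.txt
git commit -q -m "initial commit"
git push -q ../team.git master

# You: clone and start a feature branch (Day 1).
cd "$work"
git clone -q team.git some_local_folder
cd some_local_folder
git checkout -q -b my_awesome_feature
echo 'draft' > feature.txt
git add feature.txt
git commit -q -m "start feature"

# Meanwhile, the teammate updates master on the remote.
cd "$work/teammate"
echo 'v2' > core.txt
git commit -q -am "update core"
git push -q ../team.git master

# Day 2, steps 1 to 3.
cd "$work/some_local_folder"
git checkout -q master                 # Step 1
git pull -q                            # Step 2: fast-forwards local master
git checkout -q my_awesome_feature
git merge -q --no-edit master          # Step 3: bring master into the branch
cat core.txt                           # prints v2
```

The feature branch now contains both your own commit and the teammate's update, so the following steps (write, stage, commit, push) start from an up-to-date base.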

Good Habits - Day 42 - Make pull requests

When you are satisfied with your implementation of the feature:

  1. you create a pull request (PR) for integrating your work in the original master branch;
  2. it is then reviewed by the maintainer and whoever you assigned to your PR;
  3. after some rounds of comments between the reviewers and you, they merge the PR;
  4. you can switch back to master and do a git pull to make sure your local master matches the updated remote master;
  5. you can start again with a new branch for a new feature.

A note on forking

When you are not part of the team, you can still contribute by forking the remote repo from GitHub.

  1. Forking makes a copy of the master remote into your GitHub account under the same repo name;

  2. When you git clone it, the remote master origin/master points to the forked master, which is why you can then git push to it;

  3. The forked master should never be modified;

  4. If you want to keep track of the latest changes in the original master, you need to manually add a remote via git remote add. By convention, the original repo should be called upstream;

    git remote add upstream https://github.com/OWNER/REPO.git
  5. Now you can stay up to date whenever you want by switching to your master branch and running

    git pull upstream master

    Then git push to save these changes to your forked master as well.

  6. Finally, you can switch back to your branch via git checkout my_awesome_feature and run git merge master to bring your branch up to date as well.
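The forking steps above can be tried locally. The sketch below rests on several assumptions: upstream.git is a local bare repo standing in for the original GitHub repo, fork.git (a bare clone of it) stands in for the copy that clicking “Fork” would create, and the maintainer repository, file names and commit messages are hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME='Jane Doe' GIT_AUTHOR_EMAIL='jane@example.com'
export GIT_COMMITTER_NAME='Jane Doe' GIT_COMMITTER_EMAIL='jane@example.com'

work=$(mktemp -d)
cd "$work"

# Assumption: the original repo, seeded by a hypothetical maintainer.
git init -q --bare upstream.git
git -C upstream.git symbolic-ref HEAD refs/heads/master
git init -q maintainer
cd maintainer
git symbolic-ref HEAD refs/heads/master
echo 'v1' > core.txt
git add core.txt
git commit -q -m "initial commit"
git push -q ../upstream.git master
cd "$work"

# Step 1 (stand-in for forking on GitHub): copy the original repo.
git clone -q --bare upstream.git fork.git

# Step 2: clone your fork; origin now points to the fork.
git clone -q fork.git my_clone
cd my_clone
git checkout -q -b my_awesome_feature
echo 'draft' > feature.txt
git add feature.txt
git commit -q -m "start feature"

# Step 4: register the original repo under the conventional name upstream.
git remote add upstream "$work/upstream.git"

# Meanwhile, the original master moves on.
cd "$work/maintainer"
echo 'v2' > core.txt
git commit -q -am "update core"
git push -q ../upstream.git master

# Step 5: update your master from upstream, then save it to your fork.
cd "$work/my_clone"
git checkout -q master
git pull -q upstream master
git push -q origin master

# Step 6: bring your feature branch up to date.
git checkout -q my_awesome_feature
git merge -q --no-edit master
git remote -v
```

Note that master is only ever fast-forwarded from upstream here; all of your own commits live on my_awesome_feature.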

Good habits - Summing up

Remember that the origin master branch is the one used by the world to install your software.

Four rules to adopt that will make your life easy

  • new feature = new branch;
  • always pull before working + merge from master to stay up to date with the latest approved changes in the software;
  • propose pull request to integrate your work officially in the software.
  • never modify a forked master branch.

In other words:

  • before a coding session, make your masters and branches up to date.
  • at the end of your coding session, every file that has been created or modified should be committed and pushed to the corresponding branch.

Your turn

  • Download the slides: https://github.com/astamm/git-workshop/blob/master/git-workshop-build.html

  • Open the HTML slides in Google Chrome

  • Open a terminal (macOS or Linux) or Git Bash app on Windows and introduce yourself to Git

  • Install GitKraken and open it as well (take time to answer the initial questions)

  • Go to GitHub and fork the astamm/git-workshop repo in your account

  • From GitKraken

    • Clone the repo from your account on your computer (via GitKraken)
    • Create a branch YOU-intro, where YOU should be replaced with your name (lower case)
    • Switch to this branch
  • Add a .txt file that contains your name, email address and computer skill learning interests.

  • Stage, commit and push the file on the branch YOU-intro (from GitKraken)

  • Make a pull request to my workshop repo (from GitHub)

The Maintainer

Do this once per new project.

Go to https://github.com and make sure you are logged in.

Click the green “New repository” button. Or, if you are on your own profile page, click on “Repositories”, then click the green “New” button.

  • Repository name: myrepo (or whatever you wish)
  • Public
  • YES Initialize this repository with a README

Click the big green button “Create repository.”

Copy the HTTPS clone URL to your clipboard via the green “Clone or Download” button. Or copy the SSH URL if you chose to set up SSH keys.

In a Bash terminal, type

git clone paste_from_clipboard folder_you_want_to_clone_into/

Side note for LMJL members

If you want to experiment with teamwork, I can set up team GitHub projects and add you as members.

R, RStudio, Git

The usethis package

  • Tweak your .Rprofile setup with use_devtools() and edit_r_profile()
  • Create a new package instantly with create_package() with automatic setup of roxygen for handling documentation
  • Choose a license with use_XXX_license()
  • Set up your package directory as a Git repo: use_git()
  • Host it on GitHub with use_github()
  • Add package dependencies with use_package()
  • Add data to your package with use_data()
  • Add global documentation for your package with use_package_doc()
  • Add a README file, written in R Markdown, as a showcase with use_readme_rmd()
  • Add a NEWS.md file for reporting novelties at each release with use_news_md()
  • Add vignettes to detail parts of your package with use_vignette()
  • Add unit testing to your package for every function you write with use_testthat() and use_test()
  • Run the tests with devtools::test()

RStudio API for Git operations

  • Use RStudio API helpers to document your functions
  • Rebuild setup

DEMO